{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 搜索 - 黑白棋 AI 算法"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1. 实验介绍  \n",
    "## 1.1 实验内容  \n",
    "黑白棋 (Reversi)，也叫苹果棋，翻转棋，是一个经典的策略性游戏。  \n",
    "\n",
    "一般棋子双面为黑白两色，故称“黑白棋”。因为行棋之时将对方棋子翻转，则变为己方棋子，故又称“翻转棋” (Reversi) 。  \n",
    "棋子双面为红、绿色的称为“苹果棋”。它使用 8x8 的棋盘，由两人执黑子和白子轮流下棋，最后子多方为胜方。  \n",
    "随着网络的普及，黑白棋作为一种最适合在网上玩的棋类游戏正在逐渐流行起来。  \n",
    "中国主要的黑白棋游戏站点有 Yahoo 游戏、中国游戏网、联众游戏等。  \n",
    "\n",
    "[黑白棋示范视频](https://v.youku.com/v_show/id_XMjYyMzc1Mjcy.html?spm=a2h0k.11417342.soresults.dtitle)\n",
    "可以从4分钟开始观看。\n",
    "\n",
    "\n",
    "\n",
    "**游戏规则**：  \n",
    "棋局开始时黑棋位于 E4 和 D5 ，白棋位于 D4 和 E5，如图所示。   \n",
    "\n",
    "<img src=\"http://imgbed.momodel.cn/white_and_black.jpg\" width=300>\n",
    "\n",
    "1. 黑方先行，双方交替下棋。\n",
    "2. 一步合法的棋步包括：\n",
    "  + 在一个空格处落下一个棋子，并且翻转对手一个或多个棋子；\n",
    "  + 新落下的棋子必须落在可夹住对方棋子的位置上，对方被夹住的所有棋子都要翻转过来，  \n",
    "    可以是横着夹，竖着夹，或是斜着夹。夹住的位置上必须全部是对手的棋子，不能有空格；  \n",
    "  + 一步棋可以在数个（横向，纵向，对角线）方向上翻棋，任何被夹住的棋子都必须被翻转过来，棋手无权选择不去翻某个棋子。  \n",
    "3. 如果一方没有合法棋步，也就是说不管他下到哪里，都不能至少翻转对手的一个棋子，那他这一轮只能弃权，而由他的对手继续落子直到他有合法棋步可下。\n",
    "4. 如果一方至少有一步合法棋步可下，他就必须落子，不得弃权。  \n",
    "5. 棋局持续下去，直到棋盘填满或者双方都无合法棋步可下。  \n",
    "6. 如果某一方落子时间超过 1 分钟 或者 连续落子 3 次不合法，则判该方失败。  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.2 实验要求\n",
    "+ 使用 **『蒙特卡洛树搜索算法』** 实现 miniAlphaGo for Reversi。   \n",
    "+ 使用 Python 语言。\n",
    "+ 算法部分需要自己实现，不要使用现成的包、工具或者接口。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.3 注意事项\n",
    "+ Python 与 Python Package 的使用方式，可在右侧 `API文档` 中查阅。\n",
    "+ 在与人类玩家对奕时，运行环境将等待用户输入座标，此时代码将处于 While..Loop 回圈中，请务必输入'Q'离开，否则将持续系统将等待(hold）。\n",
    "+ 当右上角的『Python 3』长时间指示为运行中的时候，造成代码无法执行时，可以重新启动 Kernel 解决（左上角『Kernel』-『Restart Kernel』）。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.实验内容"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.1 棋盘介绍"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1.1 初始化棋盘"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "棋盘规格是 8x8，'X' 代表黑棋，'O' 代表白棋，'.' 代表未落子状态。\n",
    "  \n",
    "棋盘初始化 - 利用 Board 类（board.py）中的 `display()` 方法展示棋盘："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  A B C D E F G H\n",
      "1 . . . . . . . .\n",
      "2 . . . . . . . .\n",
      "3 . . . . . . . .\n",
      "4 . . . O X . . .\n",
      "5 . . . X O . . .\n",
      "6 . . . . . . . .\n",
      "7 . . . . . . . .\n",
      "8 . . . . . . . .\n",
      "统计棋局: 棋子总数 / 每一步耗时 / 总时间 \n",
      "黑   棋: 2 / 0 / 0\n",
      "白   棋: 2 / 0 / 0\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# 导入棋盘文件\n",
    "from board import Board\n",
    "\n",
    "# 初始化棋盘\n",
    "board = Board()\n",
    "\n",
    "# 打印初始化棋盘\n",
    "board.display()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1.2 棋盘与坐标之间的关系    \n",
    "\n",
    "||A|B|C|D|E|F|G|H|\n",
    "|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|\n",
    "|1|(0,0)|(0,1)|(0,2)|(0,3)|(0,4)|(0,5)|(0,6)|(0,7)|\n",
    "|2|(1,0)|(1,1)|(1,2)|(1,3)|(1,4)|(1,5)|(1,6)|(1,7)|\n",
    "|3|(2,0)|(2,1)|(2,2)|(2,3)|(2,4)|(2,5)|(2,6)|(2,7)|\n",
    "|4|(3,0)|(3,1)|(3,2)|(3,3)|(3,4)|(3,5)|(3,6)|(3,7)|\n",
    "|5|(4,0)|(4,1)|(4,2)|(4,3)|(4,4)|(4,5)|(4,6)|(4,7)|\n",
    "|6|(5,0)|(5,1)|(5,2)|(5,3)|(5,4)|(5,5)|(5,6)|(5,7)|\n",
    "|7|(6,0)|(6,1)|(6,2)|(6,3)|(6,4)|(6,5)|(6,6)|(6,7)|\n",
    "|8|(7,0)|(7,1)|(7,2)|(7,3)|(7,4)|(7,5)|(7,6)|(7,7)|"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "棋盘坐标 E4, 转化为坐标形式就是 (3, 4), 坐标数值大小是从 0 开始，到 7 结束。  \n",
    "\n",
    "Board 类中，提供以上两种坐标的转化方法：\n",
    "+ `board_num(action)`: 棋盘坐标转化为数字坐标。\n",
    "    + action: 棋盘坐标，e.g. 'G6'\n",
    "    + 返回值: 数字坐标，e.g. (5, 6)\n",
    "+ `num_board(action)`: 数字坐标转化为棋盘坐标。\n",
    "    + action: 数字坐标，e.g. (2, 7)\n",
    "    + 返回值: 棋盘坐标，e.g. 'H3'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "D5\n",
      "(1, 6)\n"
     ]
    }
   ],
   "source": [
    "# 查看坐标 (4,3) 在棋盘上的位置 \n",
    "position = (4, 3)\n",
    "print(board.num_board(position))\n",
    "\n",
    "# 查看棋盘位置 'G2' 的坐标\n",
    "position = 'G2'\n",
    "print(board.board_num(position))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "  \n",
    "### 2.1.3 Board 类中比较重要的方法"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ `get_legal_actions(color)`： 根据黑白棋的规则获取 color 方棋子的合法落子坐标，用 `list()` 方法可以获取所有的合法坐标。\n",
    "    + color: 下棋方，'X' - 黑棋，'O' - 白棋\n",
    "    + 返回值: 合法的落子坐标列表  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['D3', 'C4', 'F5', 'E6']\n"
     ]
    }
   ],
   "source": [
    "# 棋盘初始化后，黑方可以落子的位置\n",
    "print(list(board.get_legal_actions('X')))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "  \n",
    "+  `_move(action, color)`：  根据 color 落子坐标 action 获取翻转棋子的坐标。  \n",
    "    + action: 落子的坐标，e.g. 'C4'\n",
    "    + color: 下棋方，'X' - 黑棋，'O' - 白棋\n",
    "    + 返回值: 反转棋子棋盘坐标列表\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  A B C D E F G H\n",
      "1 . . . . . . . .\n",
      "2 . . . . . . . .\n",
      "3 . . . X . . . .\n",
      "4 . . . X X . . .\n",
      "5 . . . X O . . .\n",
      "6 . . . . . . . .\n",
      "7 . . . . . . . .\n",
      "8 . . . . . . . .\n",
      "统计棋局: 棋子总数 / 每一步耗时 / 总时间 \n",
      "黑   棋: 4 / 0 / 0\n",
      "白   棋: 1 / 0 / 0\n",
      "\n",
      "False\n",
      "  A B C D E F G H\n",
      "1 . . . . . . . .\n",
      "2 . . . . . . . .\n",
      "3 . . . X . . . .\n",
      "4 . . . X X . . .\n",
      "5 . . . X O . . .\n",
      "6 . . . . . . . .\n",
      "7 . . . . . . . .\n",
      "8 . . . . . . . .\n",
      "统计棋局: 棋子总数 / 每一步耗时 / 总时间 \n",
      "黑   棋: 4 / 0 / 0\n",
      "白   棋: 1 / 0 / 0\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# 打印初始化后的棋盘\n",
    "board.display()\n",
    "\n",
    "# 假设现在黑棋下棋，可以落子的位置有：['D3', 'C4', 'F5', 'E6']，\n",
    "# 黑棋落子 D3 , 则白棋被翻转的棋子是 D4。 \n",
    "\n",
    "# 表示黑棋\n",
    "color = 'X' \n",
    "\n",
    "# 落子坐标\n",
    "action = 'D3' \n",
    "\n",
    "# 打印白方被翻转的棋子位置\n",
    "print(board._move(action,color))\n",
    "\n",
    "# 打印棋盘\n",
    "board.display() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.2 创建随机玩家"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "deletable": false
   },
   "outputs": [],
   "source": [
    "# 导入随机包\n",
    "import random\n",
    "\n",
    "class RandomPlayer:\n",
    "    \"\"\"\n",
    "    随机玩家, 随机返回一个合法落子位置\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self, color):\n",
    "        \"\"\"\n",
    "        玩家初始化\n",
    "        :param color: 下棋方，'X' - 黑棋，'O' - 白棋\n",
    "        \"\"\"\n",
    "        self.color = color\n",
    "        \n",
    "\n",
    "    def random_choice(self, board):\n",
    "        \"\"\"\n",
    "        从合法落子位置中随机选一个落子位置\n",
    "        :param board: 棋盘\n",
    "        :return: 随机合法落子位置, e.g. 'A1' \n",
    "        \"\"\"\n",
    "        # 用 list() 方法获取所有合法落子位置坐标列表\n",
    "        action_list = list(board.get_legal_actions(self.color))\n",
    "\n",
    "        # 如果 action_list 为空，则返回 None,否则从中选取一个随机元素，即合法的落子坐标\n",
    "        if len(action_list) == 0:\n",
    "            return None\n",
    "        else:\n",
    "            return random.choice(action_list)\n",
    "\n",
    "    def get_move(self, board):\n",
    "        \"\"\"\n",
    "        根据当前棋盘状态获取最佳落子位置\n",
    "        :param board: 棋盘\n",
    "        :return: action 最佳落子位置, e.g. 'A1'\n",
    "        \"\"\"\n",
    "        if self.color == 'X':\n",
    "            player_name = '黑棋'\n",
    "        else:\n",
    "            player_name = '白棋'\n",
    "        print(\"请等一会，对方 {}-{} 正在思考中...\".format(player_name, self.color))\n",
    "        action = self.random_choice(board)\n",
    "        return action"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "随机玩家 RandomPlayer 主要是随机获取一个合法落子位置。后续随机玩家可以跟人类玩家、AI 玩家等进行对弈。 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false
   },
   "source": [
    "随机玩家 `get_move()` 方法, 主要思路：\n",
    "+ 随机玩家的 `get_move()` 方法主要调用了 `random_choice()` 方法。  \n",
    "+ `random_choice()` 方法是：先用 `list()` 方法获取合法落子位置坐标列表，然后用 `random.choice()` 方法随机获取合法落子位置中的一个。   "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "deletable": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  A B C D E F G H\n",
      "1 . . . . . . . .\n",
      "2 . . . . . . . .\n",
      "3 . . . . . . . .\n",
      "4 . . . O X . . .\n",
      "5 . . . X O . . .\n",
      "6 . . . . . . . .\n",
      "7 . . . . . . . .\n",
      "8 . . . . . . . .\n",
      "统计棋局: 棋子总数 / 每一步耗时 / 总时间 \n",
      "黑   棋: 2 / 0 / 0\n",
      "白   棋: 2 / 0 / 0\n",
      "\n",
      "请等一会，对方 黑棋-X 正在思考中...\n",
      "黑棋玩家落子位置: C4\n",
      "黑棋落子后反转白棋的棋子坐标： ['D4']\n",
      "  A B C D E F G H\n",
      "1 . . . . . . . .\n",
      "2 . . . . . . . .\n",
      "3 . . . . . . . .\n",
      "4 . . X X X . . .\n",
      "5 . . . X O . . .\n",
      "6 . . . . . . . .\n",
      "7 . . . . . . . .\n",
      "8 . . . . . . . .\n",
      "统计棋局: 棋子总数 / 每一步耗时 / 总时间 \n",
      "黑   棋: 4 / 0 / 0\n",
      "白   棋: 1 / 0 / 0\n",
      "\n",
      "请等一会，对方 白棋-O 正在思考中...\n",
      "白棋玩家落子位置:C3\n",
      "白棋落子后反转黑棋的棋子坐标： ['D4']\n",
      "  A B C D E F G H\n",
      "1 . . . . . . . .\n",
      "2 . . . . . . . .\n",
      "3 . . O . . . . .\n",
      "4 . . X O X . . .\n",
      "5 . . . X O . . .\n",
      "6 . . . . . . . .\n",
      "7 . . . . . . . .\n",
      "8 . . . . . . . .\n",
      "统计棋局: 棋子总数 / 每一步耗时 / 总时间 \n",
      "黑   棋: 3 / 0 / 0\n",
      "白   棋: 3 / 0 / 0\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# 导入棋盘文件\n",
    "from board import Board\n",
    "\n",
    "# 棋盘初始化\n",
    "board = Board() \n",
    "\n",
    "# 打印初始化棋盘\n",
    "board.display() \n",
    "\n",
    "# 玩家初始化，输入黑棋玩家\n",
    "black_player = RandomPlayer(\"X\") \n",
    "\n",
    "# 黑棋玩家的随机落子位置\n",
    "black_action = black_player.get_move(board)  \n",
    "\n",
    "\n",
    "print(\"黑棋玩家落子位置: %s\"%(black_action))\n",
    "\n",
    "# 打印白方被翻转的棋子位置\n",
    "print(\"黑棋落子后反转白棋的棋子坐标：\",board._move(black_action,black_player.color))\n",
    "\n",
    "# 打印黑棋随机落子后的棋盘\n",
    "board.display() \n",
    "\n",
    "# 玩家初始化，输入白棋玩家\n",
    "white_player = RandomPlayer(\"O\") \n",
    "\n",
    "# 白棋玩家的随机落子位置\n",
    "white_action = white_player.get_move(board) \n",
    "\n",
    "print(\"白棋玩家落子位置:%s\"%(white_action))\n",
    "\n",
    "# 打印黑棋方被翻转的棋子位置\n",
    "print(\"白棋落子后反转黑棋的棋子坐标：\",board._move(white_action,white_player.color))\n",
    "\n",
    "# 打印白棋随机落子后的棋盘\n",
    "board.display() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.3 创建人类玩家\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "人类玩家 HumanPlayer 主要实现 `get_move()` 方法。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "class HumanPlayer:\n",
    "    \"\"\"\n",
    "    人类玩家\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self, color):\n",
    "        \"\"\"\n",
    "        玩家初始化\n",
    "        :param color: 下棋方，'X' - 黑棋，'O' - 白棋\n",
    "        \"\"\"\n",
    "        self.color = color\n",
    "    \n",
    "\n",
    "    def get_move(self, board):\n",
    "        \"\"\"\n",
    "        根据当前棋盘输入人类合法落子位置\n",
    "        :param board: 棋盘\n",
    "        :return: 人类下棋落子位置\n",
    "        \"\"\"\n",
    "        # 如果 self.color 是黑棋 \"X\",则 player 是 \"黑棋\"，否则是 \"白棋\"\n",
    "        if self.color == \"X\":\n",
    "            player = \"黑棋\"\n",
    "        else:\n",
    "            player = \"白棋\"\n",
    "\n",
    "        # 人类玩家输入落子位置，如果输入 'Q', 则返回 'Q'并结束比赛。\n",
    "        # 如果人类玩家输入棋盘位置，e.g. 'A1'，\n",
    "        # 首先判断输入是否正确，然后再判断是否符合黑白棋规则的落子位置\n",
    "        while True:\n",
    "            action = input(\n",
    "                    \"请'{}-{}'方输入一个合法的坐标(e.g. 'D3'，若不想进行，请务必输入'Q'结束游戏。): \".format(player,\n",
    "                                                                                 self.color))\n",
    "\n",
    "            # 如果人类玩家输入 Q 则表示想结束比赛\n",
    "            if action == \"Q\" or action == 'q':\n",
    "                return \"Q\"\n",
    "            else:\n",
    "                row, col = action[1].upper(), action[0].upper()\n",
    "\n",
    "                # 检查人类输入是否正确\n",
    "                if row in '12345678' and col in 'ABCDEFGH':\n",
    "                    # 检查人类输入是否为符合规则的可落子位置\n",
    "                    if action in board.get_legal_actions(self.color):\n",
    "                        return action\n",
    "                else:\n",
    "                    print(\"你的输入不合法，请重新输入!\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "人类玩家 `get_move()` 方法主要思路是：\n",
    "+ 人类玩家输入落子位置，如果输入'Q', 则返回 'Q' 并结束比赛。\n",
    "+ 如果人类玩家输入棋盘位置，e.g. 'A1'，首先判断输入是否正确，然后再判断是否符合黑白棋规则的落子位置。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  A B C D E F G H\n",
      "1 . . . . . . . .\n",
      "2 . . . . . . . .\n",
      "3 . . . . . . . .\n",
      "4 . . . O X . . .\n",
      "5 . . . X O . . .\n",
      "6 . . . . . . . .\n",
      "7 . . . . . . . .\n",
      "8 . . . . . . . .\n",
      "统计棋局: 棋子总数 / 每一步耗时 / 总时间 \n",
      "黑   棋: 2 / 0 / 0\n",
      "白   棋: 2 / 0 / 0\n",
      "\n"
     ]
    },
    {
     "name": "stdin",
     "output_type": "stream",
     "text": [
      "请'黑棋-X'方输入一个合法的坐标(e.g. 'D3'，若不想进行，请务必输入'Q'结束游戏。):  D3\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "黑棋落子后反转白棋的棋子坐标： ['D4']\n",
      "  A B C D E F G H\n",
      "1 . . . . . . . .\n",
      "2 . . . . . . . .\n",
      "3 . . . X . . . .\n",
      "4 . . . X X . . .\n",
      "5 . . . X O . . .\n",
      "6 . . . . . . . .\n",
      "7 . . . . . . . .\n",
      "8 . . . . . . . .\n",
      "统计棋局: 棋子总数 / 每一步耗时 / 总时间 \n",
      "黑   棋: 4 / 0 / 0\n",
      "白   棋: 1 / 0 / 0\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# 导入棋盘文件\n",
    "from board import Board\n",
    "\n",
    " # 棋盘初始化\n",
    "board = Board() \n",
    "\n",
    "# 打印初始化后棋盘\n",
    "board.display() \n",
    "\n",
    "# 人类玩家黑棋初始化\n",
    "black_player = HumanPlayer(\"X\") \n",
    "\n",
    "# 人类玩家黑棋落子位置\n",
    "action = black_player.get_move(board)\n",
    "\n",
    "\n",
    "# 如果人类玩家输入 'Q',则表示想结束比赛，\n",
    "# 现在只展示人类玩家的输入结果。\n",
    "if action == \"Q\":\n",
    "    print(\"结束游戏：\",action)\n",
    "else:\n",
    "    # 打印白方被翻转的棋子位置\n",
    "    print(\"黑棋落子后反转白棋的棋子坐标：\", board._move(action,black_player.color))\n",
    "\n",
    "# 打印人类玩家黑棋落子后的棋盘\n",
    "board.display() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.4 创建 Game 类\n",
    "\n",
    "该类主要实现黑白棋的对弈，已经实现随机玩家和人类玩家，现在可以来对弈一下。    \n",
    "Game 类（game.py）的主要方法和属性:  \n",
    "\n",
    "+ 属性：\n",
    "    + `self.board`：棋盘\n",
    "    + `self.current_player`：定义当前的下棋一方，考虑游戏还未开始我们定义为 None\n",
    "    + `self.black_player`：定义黑棋玩家 black_player\n",
    "    + `self.white_player`：定义白棋玩家 white_player\n",
    "\n",
    "    \n",
    "+ 方法：   \n",
    "    + `switch_player()`：下棋时切换玩家  \n",
    "    + `run()`：黑白棋游戏的主程序  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 导入黑白棋文件\n",
    "from game import Game  \n",
    "\n",
    "# 人类玩家黑棋初始化\n",
    "black_player = HumanPlayer(\"X\")\n",
    "\n",
    "# 随机玩家白棋初始化\n",
    "white_player = RandomPlayer(\"O\")\n",
    "\n",
    "# 游戏初始化，第一个玩家是黑棋，第二个玩家是白棋\n",
    "game = Game(black_player, white_player)\n",
    "\n",
    "# 开始下棋\n",
    "game.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "考虑到人类下棋比较慢，我们直接采用随机玩家与随机玩家下棋,效果如下："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 导入黑白棋文件\n",
    "from game import Game  \n",
    "\n",
    "# 随机玩家黑棋初始化\n",
    "black_player = RandomPlayer(\"X\")\n",
    "\n",
    "# 随机玩家白棋初始化\n",
    "white_player = RandomPlayer(\"O\")\n",
    "\n",
    "# 游戏初始化，第一个玩家是黑棋，第二个玩家是白棋\n",
    "game = Game(black_player, white_player)\n",
    "\n",
    "# 开始下棋\n",
    "game.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.5 创建 AI 玩家\n",
    "通过以上流程的介绍或者学习，相信大家一定很熟悉如何玩这个游戏。  \n",
    "现在 AI 玩家需要大家来完善！    \n",
    "该部分主要是需要大家使用 **『蒙特卡洛树搜索算法』** 来实现 miniAlphaGo for Reversi。  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "select": true
   },
   "outputs": [],
   "source": [
    "import random\n",
    "import copy\n",
    "from board import Board\n",
    "from Monte_TreeSerach import Node, Monte_tree, UCB\n",
    "import operator\n",
    "\n",
    "# 定义蒙特卡洛树里的节点状态转移方程\n",
    "def state_after_action(action, type, state):\n",
    "\n",
    "    state1 = copy.deepcopy(state)\n",
    "\n",
    "    # 表示这一局不落子\n",
    "    if action is None:\n",
    "        return state1\n",
    "    color = 'X' if type == 'min' else 'O'\n",
    "\n",
    "    # 得到落子后的新棋盘状态\n",
    "    state1._move(action=action, color=color)\n",
    "    return state1\n",
    "\n",
    "# 定义蒙特卡洛树里的可行action搜索方程\n",
    "def find_all_next(current):\n",
    "    color = 'X' if current.type == 'min' else 'O'\n",
    "    anti_color = 'O' if current.type == 'min' else 'X'\n",
    "\n",
    "    # 当前状态下棋方可行选择\n",
    "    l = list(current.state.get_legal_actions(color))\n",
    "    # 对方的可行选择\n",
    "    l_op = list(current.state.get_legal_actions(anti_color))\n",
    "    # 下棋方只能跳过的情况\n",
    "    if len(l) == 0 and len(l_op) > 0:\n",
    "        return [None]\n",
    "    # 下棋方下子\n",
    "    if len(l) > 0:\n",
    "        return l\n",
    "    # 游戏结束的情况\n",
    "    return None\n",
    "\n",
    "# 定义蒙特卡洛树里的奖励函数\n",
    "def get_reward(current):\n",
    "    w, _ = current.state.get_winner()\n",
    "    if w == 0:\n",
    "        # 黑棋赢\n",
    "        return 1\n",
    "    if w == 1:\n",
    "        # 白棋赢\n",
    "        return 0\n",
    "    if w == 2:\n",
    "        # 平局\n",
    "        return 0.5\n",
    "    \n",
    "\n",
    "class AIPlayer(object):\n",
    "    def __init__(self, color):\n",
    "        # AI 方的颜色\n",
    "        self.color = color\n",
    "        # 蒙特卡洛树\n",
    "        self.tree = None\n",
    "        # 当前棋盘\n",
    "        self.board = None\n",
    "\n",
    "    # 寻找蒙特卡洛树根节点的子节点中棋盘board对应的节点(若无则创建),将其变成根节点.\n",
    "    def change_root(self,board):\n",
    "        for child in self.tree.root[0].children:\n",
    "            if operator.eq(child.state._board,board._board):\n",
    "                self.tree.root.pop()\n",
    "                self.tree.root.append(child)\n",
    "                child.parent = None\n",
    "                self.tree.currents.pop()\n",
    "                self.tree.currents.append(child)\n",
    "                self.tree.all_visit = child.visit\n",
    "                return True\n",
    "        self.tree.root[0].append_child(board=board)\n",
    "        for child in self.tree.root[0].children:\n",
    "            if operator.eq(child.state._board,board._board):\n",
    "\n",
    "                self.tree.root.pop()\n",
    "                self.tree.root.append(child)\n",
    "                child.parent = None\n",
    "                self.tree.currents.pop()\n",
    "                self.tree.currents.append(child)\n",
    "                self.tree.all_visit = child.visit\n",
    "                return True\n",
    "        return False\n",
    "\n",
    "    # 初始化蒙特卡洛树\n",
    "    def generate_tree(self):\n",
    "        if self.tree is None:\n",
    "            type = 'min' if self.color == 'X' else 'max'\n",
    "            self.tree = Monte_tree(ori_state=self.board, state_after_action=state_after_action, find_all_next=find_all_next,\n",
    "                                   get_reward=get_reward, C=0.7, type=type)\n",
    "            return True\n",
    "        else:\n",
    "            self.change_root(self.board)\n",
    "        return False\n",
    "\n",
    "    def get_result(self):\n",
    "        self.generate_tree()\n",
    "\n",
    "        # UCTSearch进行蒙特卡洛搜索,最后的current[0]变成action对应的子节点.\n",
    "        action = self.tree.UCTSearch(time_limit=1)\n",
    "        self.change_root(self.tree.currents[0].state)\n",
    "        return action\n",
    "\n",
    "    def get_move(self, board):\n",
    "        # 复制一个副本\n",
    "        self.board = copy.deepcopy(board)\n",
    "        self.generate_tree()\n",
    "\n",
    "        # UCTSearch进行蒙特卡洛搜索,最后的current[0]变成action对应的子节点.\n",
    "        action = self.tree.UCTSearch(time_limit=1)\n",
    "        self.change_root(self.tree.currents[0].state)\n",
    "        return action\n",
    "# class AIPlayer:\n",
    "#     \"\"\"\n",
    "#     AI 玩家\n",
    "#     \"\"\"\n",
    "\n",
    "#     def __init__(self, color):\n",
    "#         \"\"\"\n",
    "#         玩家初始化\n",
    "#         :param color: 下棋方，'X' - 黑棋，'O' - 白棋\n",
    "#         \"\"\"\n",
    "\n",
    "#         self.color = color\n",
    "\n",
    "#     def get_move(self, board):\n",
    "#         \"\"\"\n",
    "#         根据当前棋盘状态获取最佳落子位置\n",
    "#         :param board: 棋盘\n",
    "#         :return: action 最佳落子位置, e.g. 'A1'\n",
    "#         \"\"\"\n",
    "#         if self.color == 'X':\n",
    "#             player_name = '黑棋'\n",
    "#         else:\n",
    "#             player_name = '白棋'\n",
    "#         print(\"请等一会，对方 {}-{} 正在思考中...\".format(player_name, self.color))\n",
    "\n",
    "#         # -----------------请实现你的算法代码--------------------------------------\n",
    "\n",
    "#         action = None\n",
    "#         # ------------------------------------------------------------------------\n",
    "\n",
    "#         return action\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "以上就是 AI 玩家的初步代码，其中特别注意：\n",
    "1. **请不要修改get_move方法的输入和输出**。\n",
    "2. 可以添加 AIPlayer 的属性和方法。\n",
    "3. 完善算法时请注意落子时间：落子需要在 **60s** 之内！\n",
    "4. 落子 3 次不在合法范围内即判断该方失败, 故落子前请检查棋子的合法性。\n",
    "5. 提交作业时请导入必要的包和第三方库 (包括此文件中曾经导入过的)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.5.1 测试 AI 玩家 \n",
    "如果您已经实现 AIPlayer，你可以选人类玩家、随机玩家与 AIPlayer 算法对战，甚至 AIPlayer 与 AIPlayer 自己对战！"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 导入黑白棋文件\n",
    "from game import Game  \n",
    "\n",
    "# 人类玩家黑棋初始化\n",
    "black_player =  HumanPlayer(\"X\")\n",
    "\n",
    "# AI 玩家 白棋初始化\n",
    "white_player = AIPlayer(\"O\")\n",
    "\n",
    "# 游戏初始化，第一个玩家是黑棋，第二个玩家是白棋\n",
    "game = Game(black_player, white_player)\n",
    "\n",
    "# 开始下棋\n",
    "game.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.5.2 作业提交  \n",
    "\n",
    "+ 经过`2.5.1 测试 AI 玩家`实现人类玩家与 AI 玩家对弈之后，在左侧 `提交作业` 的标签中，把整个 AIPlayer 转化为 main.py 文件进行`系统测试`。\n",
    "+ 你可以选择初级、中级或者高级对手进行对弈，对弈时请勾选 main.py 文件。  \n",
    "+ 能通过测试就可以**提交作业**。 "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.5"
  },
  "nbTranslate": {
   "displayLangs": [
    "fr",
    "en"
   ],
   "hotkey": "alt-t",
   "langInMainMenu": true,
   "sourceLang": "en",
   "targetLang": "fr",
   "useGoogleTranslate": true
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
